knitr::opts_chunk$set(fig.width = 6, fig.height = 4, # fig.path = 'Figs/',
echo = TRUE, message = FALSE, warning = FALSE)
library(tidyverse)
library(RColorBrewer)
### The book has an R package!
library(primer) # install.packages('primer')
Casey’s note: in this chapter, if we’re using matrices for a lot of things, not sure that Tidyverse functions will help us too much (they’re focused on data.frames). So maybe our coding practice can focus on expanding the examples in the text, and/or writing functions to apply the examples?
Let’s define two \(2 \times 2\) matrices, filling in one by rows, and the other by columns.
M <- matrix(1:4, nrow = 2, byrow = TRUE)
# [,1] [,2]
# [1,] 1 2
# [2,] 3 4
N <- matrix(c(10, 20, 30, 40), nrow = 2)
# [,1] [,2]
# [1,] 10 30
# [2,] 20 40
Following our rules above, we would multiply and then sum the first row of \(M\) by the first column of \(N\), and make this element \(a_{11}\) of the resulting matrix product.
1 * 10 + 2 * 20
# [1] 50
### We multiply matrices using %*% to signify that we mean matrix multiplication.
M %*% N
# [,1] [,2]
# [1,] 50 110
# [2,] 110 250
First, we create a population projection matrix, and a vector of stage class abundances for year zero.
A <- matrix(c(0, 0.5, 20, 0.3, 0, 0, 0, 0.5, 0.9), nr = 3,
byrow = TRUE)
N0 <- matrix(c(100, 250, 50), ncol = 1)
Now we perform matrix multiplication between the projection matrix and N0.
N1 <- A %*% N0
# [,1]
# [1,] 1125
# [2,] 30
# [3,] 170
Note that the first stage declined, while the second and third stages increased.
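As a check on the arithmetic, we can compute each element of N1 by hand as a row-of-A-times-N0 sum; a self-contained sketch restating A and N0 from above:

```r
### Hand-check: each element of N1 is a row of A times the column N0
A <- matrix(c(0, 0.5, 20, 0.3, 0, 0, 0, 0.5, 0.9), nrow = 3, byrow = TRUE)
N0 <- c(100, 250, 50)
c(sum(A[1, ] * N0), sum(A[2, ] * N0), sum(A[3, ] * N0))
# [1] 1125 30 170
```

This matches A %*% N0, element by element.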
Now we project our population over six years, using a for-loop. We use a for-loop, rather than sapply, because each year depends on the previous year (see the Appendix, sec. B.6). First, we set the number of years we want to project, and then create a matrix to hold the results. We put N0 in the first column.
years <- 6
N.projections <- matrix(0, nrow = nrow(A), ncol = years + 1)
N.projections[, 1] <- N0
Now we perform the iteration with the for-loop.
Casey’s note: this is a weird way of using for loops - on a single line. Reformat it using Hadley’s style guide
for (i in 1:years) N.projections[, i + 1] <- A %*% N.projections[, i]
### Mireia’s version: loop using Hadley’s style
for (i in 1:years) {
N.projections[, i + 1] <- A %*% N.projections[, i]
}
Last, we graph the results for each stage (Fig. 2.3a). To graph a matrix, R expects the data to be in columns, not rows, so we need to transpose the projection matrix.
matplot(0:years, t(N.projections), type = "l", lty = 1:3,
col = 1, ylab = "Stage Abundance", xlab = "Year")
legend("topleft", legend = c("Seeds", "Small Adult", "Large Adult"),
lty = 1:3, col = 1, bty = "n")
Now let’s calculate \(R_t = N_{t+1}/N_t\) for each year \(t\). We first need to sum all the stages, by applying the sum function to each column.
N.totals <- apply(N.projections, 2, sum)
### Now we get each Rt by dividing all the Nt+1 by each Nt.
### Using negative indices causes R to drop those elements.
Rs <- N.totals[-1]/N.totals[-(years + 1)]
### We have one fewer Rs than we do years, so let’s plot each R
### in each year t, rather than each year t + 1 (Fig. 2.3b).
plot(0:(years - 1), Rs, type = "b", xlab = "Year", ylab = "R")
Here we perform eigenanalysis on A.
eigs.A <- eigen(A)
# $values
# [1] 1.834+0.000i -0.467+1.159i -0.467-1.159i
# $vectors
# [,1] [,2] [,3]
# [1,] 0.98321+0i 0.97033+0.00000i 0.97033+0.00000i
# [2,] 0.16085+0i -0.08699-0.21603i -0.08699+0.21603i
# [3,] 0.08613+0i -0.02048+0.06165i -0.02048-0.06165i
Each eigenvalue and its corresponding eigenvector provides a solution to eq. 2.8.
Next we explicitly find the index position of the largest absolute value of the eigenvalues. In most cases, it is the first eigenvalue.
dom.pos <- which.max(Mod(eigs.A[["values"]])) ### Mod() gets the modulus; which.max() cannot compare complex values directly
We use that index to extract the largest eigenvalue. We keep the real part, using Re, dropping the imaginary part. (Note that although the dominant eigenvalue will be real, R will include an imaginary part equal to zero (0i) as a placeholder if any of the eigenvalues have a non-zero imaginary part.)
L1 <- Re(eigs.A[["values"]][dom.pos])
# [1] 1.834
L1 is \(\lambda_1\), the asymptotic finite rate of increase.
Because growth is an exponential process, we can figure out what is most important in a projection matrix by multiplying it by the stage structure many times. This is actually one way of performing eigenanalysis, and it is called the power iteration method. It is not terribly efficient, but it works well in some specific applications. (This method is not used by modern computational languages such as R.) The population size will grow toward infinity, or shrink toward zero, so we keep rescaling \(N\), dividing the stages by the total \(N\), just to keep things manageable. Let \(t\) be big, and rescale \(N\).
t <- 20
Nt <- N0/sum(N0)
We then create a for-loop that re-uses Nt for each time step, making sure we have an empty numeric vector to hold the output.
Casey’s note: this is an even weirder format for a for loop - reformat it! I made a correction to the transcription; now the for loop is written as it is in the book.
R.t <- numeric(t)
for (i in 1:t) R.t[i] <- {
Nt1 <- A %*% Nt
R <- sum(Nt1)/sum(Nt)
Nt <- Nt1/sum(Nt1)
R
}
### Mireia’s version: reformatting the for loop
R.t <- numeric(t)
for (i in 1:t) {
R.t[i] <- {
Nt1 <- A %*% Nt
R <- sum(Nt1)/sum(Nt)
Nt <- Nt1/sum(Nt1)
R
}
}
Let’s compare the result directly to the point estimate of \(\lambda_1\) (Fig. 2.4).
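Before plotting, a quick numeric check (a self-contained sketch restating A and the initial vector; not from the book): after 20 rescaled iterations the growth ratio should be very close to the dominant eigenvalue.

```r
### Power iteration in miniature: rescaled growth ratios converge to lambda_1
A <- matrix(c(0, 0.5, 20, 0.3, 0, 0, 0, 0.5, 0.9), nrow = 3, byrow = TRUE)
Nt <- c(100, 250, 50) / 400 # rescaled initial stage vector
t <- 20
R.t <- numeric(t)
for (i in 1:t) {
  Nt1 <- A %*% Nt
  R.t[i] <- sum(Nt1) / sum(Nt)
  Nt <- Nt1 / sum(Nt1)
}
L1 <- Re(eigen(A)$values[which.max(Mod(eigen(A)$values))])
abs(R.t[t] - L1) # should be very small
```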
par(mar = c(5, 4, 3, 2))
plot(1:t, R.t, type = "b",
main = quote("Convergence Toward " * lambda))
points(t, L1, pch = 19, cex = 1.5)
The dominant eigenvector, \(\mathbf w\), is in the same position as the dominant eigenvalue. We extract \(\mathbf w\), keeping just the real part, and divide it by its sum to get the stable stage distribution.
w <- Re(eigs.A[["vectors"]][, dom.pos])
ssd <- w/sum(w)
round(ssd, 3)
# [1] 0.799 0.131 0.070
This shows us that if the projection matrix does not change over time, the population will eventually be composed of 80% seeds, 13% small adults, and 7% large adults. Iterating the population projection will also eventually provide the stable stage distribution (e.g., Fig. 2.3a).
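We can also verify the eigenvector property directly, \(\mathbf{A w} = \lambda_1 \mathbf w\); a self-contained sketch restating A:

```r
### Verify A w = lambda_1 w for the dominant eigenvalue/eigenvector pair
A <- matrix(c(0, 0.5, 20, 0.3, 0, 0, 0, 0.5, 0.9), nrow = 3, byrow = TRUE)
eigs.A <- eigen(A)
dom.pos <- which.max(Mod(eigs.A$values))
L1 <- Re(eigs.A$values[dom.pos])
w <- Re(eigs.A$vectors[, dom.pos])
max(abs(A %*% w - L1 * w)) # effectively zero
```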
We get the left eigenvalues and -vectors by performing eigenanalysis on the transpose of the projection matrix. The positions of the dominant right and left eigenvalues are the same, and typically they are the first. We perform eigenanalysis, extracting just the dominant left eigenvector; we then scale it, so that stage 1 has a reproductive value of 1.0.
M <- eigen(t(A))
v <- Re(M$vectors[, which.max(Re(M$values))])
RV <- v/v[1]
# [1] 1.000 6.113 21.418
Here we see a common pattern: reproductive value, \(v\), increases with age. In general, the reproductive value of individuals in a stage increases with increasing probability of reaching fecund stages.
Let’s calculate sensitivities now. First we calculate the numerator for eq. 2.13.
vw.s <- v %*% t(w)
Now we sum these to get the denominator, and then divide to get the sensitivities. (The dot product \(\mathbf v \cdot \mathbf w\) yields a \(1 \times 1\) matrix; in order to divide by this quantity, the simplest thing is to cause the dot product to be a simple scalar rather than a matrix (using as.numeric), and then R will divide each element by it.)
S <- vw.s/as.numeric(v %*% w)
# [,1] [,2] [,3]
# [1,] 0.258 0.04221 0.0226
# [2,] 1.577 0.25798 0.1381
# [3,] 5.526 0.90396 0.4840
We see from this that the most important transition exhibited by the plant is \(s_{21}\), surviving from the seed stage to the second stage (the element \(s_{31}\) is larger, but is not a transition that the plant undergoes).
Calculating the elasticities is also easy in R: we weight each sensitivity by its matrix element, divided by \(\lambda_1\).
elas <- (A/L1) * S
round(elas, 3)
# [,1] [,2] [,3]
# [1,] 0.000 0.012 0.246
# [2,] 0.258 0.000 0.000
# [3,] 0.000 0.246 0.238
Let’s import the data and have a look at it. For these purposes, we will assume that the data are clean and correct. Obviously, if I were doing this for the first time, data-checking and clean-up would be an important first step. Here we simply load them from the primer package.
data(stagedat)
data(fruitdat)
data(seeddat)
Now I look at the structure of the data to make sure it is at least approximately what I think it is.
str(stagedat)
## 'data.frame': 414 obs. of 4 variables:
## $ PalmNo: int 1 2 3 4 5 6 7 8 9 10 ...
## $ Y2003 : int 4 5 5 4 3 2 4 3 3 4 ...
## $ Y2004 : int 5 4 5 5 4 3 5 3 4 4 ...
## $ Y2005 : int 5 5 5 5 4 3 5 4 4 5 ...
The stage data provide the stage of each individual in the study. Each row is an individual, and its ID number is in column 1. Data in columns 2–4 identify its stage in years 2003–2005.
We can count, or tabulate, the number of individuals in each stage in 2004.
table(stagedat[["Y2004"]])
##
## 0 2 3 4 5
## 17 58 48 126 165
We see, for instance, that in 2004 there were 165 individuals in stage 5. We also see that 17 individuals were dead in 2004 (stage = 0); these were alive in either 2003 or 2005.
The fruit data have a different structure. Each row simply identifies the stage of each individual (col 1) and its fertility (number of seeds) for 2004.
str(fruitdat)
## 'data.frame': 68 obs. of 2 variables:
## $ Stage: int 4 4 4 4 4 4 4 4 4 4 ...
## $ Y2004: int 6 0 0 0 0 0 0 0 0 0 ...
We can tabulate the numbers of seeds (columns) of each stage (rows).
table(fruitdat[["Stage"]], fruitdat[["Y2004"]])
##
## 0 1 2 3 4 5 6 8 15 22 30 37 70 98 107 109
## 4 28 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0
## 5 23 1 1 1 2 2 0 1 1 1 1 1 1 1 1 1
For instance, of the individuals in stage 4 (row 1), 28 individuals had no seeds, and one individual had 6 seeds. Note also that only stage 4 and 5 had plants with any seeds.
The seed data are the fates of each seed in a sample of 400 seeds, in a data frame with only one column.
table(seeddat)
## seeddat
## 0 1 2
## 332 11 57
Seeds may have germinated (2), remained viable (1), or died (0).
Now we work through the steps to create the projection matrix from individuals tagged in year 2003 and re-censused in 2004, converting the life cycle graph (Fig. 2.5) into a transition matrix.
Along the major diagonal (where \(i = j\)) the \(P_{ij}\) represent the probability that a palm stays in the same stage. In the lower off-diagonal \((i > j)\) the \(P_{ij}\) represent the probability of growth, that an individual grows from stage \(j\) into stage \(i\). In the upper off-diagonal \((i < j)\) the \(P_{ij}\) represent the probability of regression, that an individual regresses from stage \(j\) back into stage \(i\). The \(F_i\) represent the fertility of stage \(i\).
As a practical matter, we will use basic data manipulation in R to transform the raw data into transition elements. We had no particular reason for having the data in this form; it is simply how the data were available.
\[\begin{pmatrix} P_{11} & 0 & 0 & F_4 & F_5\\ P_{21} & P_{22} & P_{23} & 0 & 0\\ 0 & P_{32} & P_{33} & P_{34} & 0\\ 0 & 0 & P_{43} & P_{44} & P_{45}\\ 0 & 0 & 0 & P_{54} & P_{55} \end{pmatrix}\]
We first create a zero matrix that we will then fill.
mat1 <- matrix(0, nrow = 5, ncol = 5)
For each stage, we get mean fertility by applying mean to each stage of the 2004 fertility data. Here Stage is a factor and tapply will calculate a mean for each level of the factor. We will assume that half of the seeds are male. Therefore, we divide fertility by 2 to calculate the fertility associated with just the female seeds.
ferts <- tapply(fruitdat$Y2004, fruitdat$Stage, mean)/2
ferts
## 4 5
## 0.1034483 6.6666667
These fertilities, \(F_4\) and \(F_5\), are the transitions from stages 4 and 5 (adults) to stage 1 (seeds). Next we insert the fertilities (ferts) into the matrix we established above.
mat1[1, 4] <- ferts[1]
mat1[1, 5] <- ferts[2]
Now we get the frequencies of each seed fate (die, remain viable but dormant, or germinate), and then divide these by the number of seeds tested (the length of the seed vector); this results in proportions and probabilities.
seed.freqs <- table(seeddat[, 1])
seedfates <- seed.freqs/length(seeddat[, 1])
seedfates
##
## 0 1 2
## 0.8300 0.0275 0.1425
The last of these values is \(P_{21}\), the transition from the first stage (seeds) to stage 2 (seedlings). The second value is the transition of seed dormancy (\(P_{11}\)), that is, the probability that a seed remains a viable seed rather than dying or becoming a seedling.
Next we insert the seed transitions into our projection matrix.
mat1[1, 1] <- seedfates[2]
mat1[2, 1] <- seedfates[3]
Here we calculate the transition probabilities for the vegetative stages. The pair of for-loops will calculate these transitions and put them into stages 2–5. The functions inside the for-loops (a) subset the data for each stage in 2003, (b) count the total number of individuals in each stage in 2003 (year j), (c) sum the number of individuals in each stage in 2004, given each stage for 2003, and then (d) calculate the proportion of each stage in 2003 that shows up in each stage in 2004.
for (i in 2:5) {
for (j in 2:5) mat1[i, j] <- {
x <- subset(stagedat, stagedat$Y2003 == j)
jT <- nrow(x)
iT <- sum(x$Y2004 == i)
iT/jT
}
}
round(mat1, 2)
## [,1] [,2] [,3] [,4] [,5]
## [1,] 0.03 0.00 0.00 0.10 6.67
## [2,] 0.14 0.70 0.05 0.01 0.00
## [3,] 0.00 0.23 0.42 0.04 0.00
## [4,] 0.00 0.00 0.46 0.67 0.07
## [5,] 0.00 0.00 0.02 0.26 0.90
Here we can see the key parts of a real projection matrix.
Compare these probabilities and fertilities to the life cycle graph and its matrix (Fig. 2.5, eq. (2.15)).
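As an aside, the nested for-loops above can be replaced by a cross-tabulation: table counts the 2003-to-2004 transitions and prop.table converts each 2003 column to proportions. A loop-free sketch on a tiny made-up data set (stagedat_toy is hypothetical, not the real census data):

```r
### Loop-free transition proportions via cross-tabulation (toy data)
stagedat_toy <- data.frame(Y2003 = c(2, 2, 3, 3, 4),
                           Y2004 = c(2, 3, 3, 4, 4))
trans <- table(stagedat_toy$Y2004, stagedat_toy$Y2003) # rows = 2004 stage, cols = 2003 stage
props <- prop.table(trans, margin = 2) # each 2003 column now sums to 1
props
```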
The diagonal elements \(P_{jj}\) are stasis probabilities, that an individual remains in that stage. Growth, from one stage to the next, is the lower off-diagonal, \(P_{j+1,j}\). Regression, moving back one stage, is the upper off-diagonal, \(P_{j-1,j}\). The fertilities are in the top row, in columns 4 and 5. Note that there is a transition element in our data that is not in eq. (2.15): \(P_{53}\). This corresponds to very rapid growth — a real event, albeit highly unusual.
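These pieces can be pulled straight out of the matrix; a sketch using the rounded matrix printed above:

```r
### Stasis, growth, and regression extracted from the rounded matrix above
mat1 <- matrix(c(0.03, 0.00, 0.00, 0.10, 6.67,
                 0.14, 0.70, 0.05, 0.01, 0.00,
                 0.00, 0.23, 0.42, 0.04, 0.00,
                 0.00, 0.00, 0.46, 0.67, 0.07,
                 0.00, 0.00, 0.02, 0.26, 0.90), nrow = 5, byrow = TRUE)
diag(mat1)                       # stasis P_jj
mat1[row(mat1) == col(mat1) + 1] # growth P_{j+1,j} (includes P_21, seed to seedling)
mat1[row(mat1) == col(mat1) - 1] # regression P_{j-1,j}
```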
What a lot of work! The beauty, of course, is that we can put all of those lines of code into a single function, called, for instance, ProjMat, and all we have to supply are the three data sets. You could examine this function by typing ProjMat on the command line, with no parentheses, to see the code and compare it to our code above. You could also try it with data.
ProjMat(stagedat, fruitdat, seeddat)
## [,1] [,2] [,3] [,4] [,5]
## [1,] 0.0275 0.0000000 0.00000000 0.103448276 6.66666667
## [2,] 0.1425 0.7012987 0.05084746 0.007518797 0.00000000
## [3,] 0.0000 0.2337662 0.42372881 0.037593985 0.00000000
## [4,] 0.0000 0.0000000 0.45762712 0.669172932 0.06896552
## [5,] 0.0000 0.0000000 0.01694915 0.255639098 0.89655172
This provides the observed transition matrix, shown above at full precision.
Next we want to do all those eigenanalyses and manipulations that gave us \(\lambda\), the stable age distribution, reproductive value, and the sensitivity and elasticity matrices. All of this code is wrapped up in the function DemoInfo. Convince yourself it is the same code by typing DemoInfo with no parentheses at the prompt. Here we try it out on the projection matrix we created above, and examine the components of the output.
str(DemoInfo(mat1))
## List of 6
## $ lambda : num 1.13
## $ SSD : num [1:5] 0.5632 0.195 0.0685 0.0811 0.0922
## $ RV : num [1:5] 1 7.76 14.37 20.18 33.95
## $ Sensitivities: num [1:5, 1:5] 0.0719 0.5587 1.0339 1.4521 2.4424 ...
## $ Elasticities : num [1:5, 1:5] 0.00174 0.0702 0 0 0 ...
## $ PPM : num [1:5, 1:5] 0.0275 0.1425 0 0 0 ...
We find that DemoInfo returns a list with six named components. The first component is a scalar, the next two are numeric vectors, and the last three are numeric matrices. The last of these is the projection matrix itself; it is often useful to return it, to prove to ourselves that we analyzed the matrix we intended to.
All of the above was incredibly useful and provides the best estimates of most or all of the parameters we might want. However, it does not provide any idea of the certainty of those parameters. Bootstrapping these estimates, by resampling our data, gives us an idea of that uncertainty.
Here we work through the steps of resampling our data, as we build a function, step by step, inch by inch. The basic idea of resampling is that we assume that our sample data are the best available approximation of the entire population. Therefore, we draw, with replacement, new data sets from the original one. See the last section in Chapter 1 for ideas regarding simulations and bootstrapping.
We will create new resampled (bootstrapped) data sets, where the rows of the original data sets are selected at random with replacement. We then apply ProjMat and DemoInfo.
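The resampling idea in miniature, on hypothetical numbers rather than the palm data: draw values with replacement many times, recompute the statistic each time, and summarize the spread.

```r
### Bootstrap a simple mean (toy example; x is made up)
set.seed(1)
x <- c(3, 5, 7, 9, 11)
boot_means <- sapply(1:1000, function(i) mean(sample(x, length(x), replace = TRUE)))
quantile(boot_means, c(0.025, 0.975)) # a rough 95% CI for the mean
```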
The first step is to get the number of observations in the original data.
nL <- nrow(stagedat)
nF <- nrow(fruitdat)
nS <- nrow(seeddat)
With these numbers, we will be able to resample our original data sets, getting the correct number of resampled observations.
Now we are going to use lapply to perform everything multiple times. By “everything,” I mean resampling each of the three data sets with replacement, calculating the projection matrix with ProjMat, and analyzing it with DemoInfo.
All of that is one replicate simulation, \(n = 1\). For now, let’s say n = 5 times as a trial. Eventually this step is the one we will ask R to do 1000 or more times.
n <- 5Next we use lapply to do everything, that is, a replicate simulation, \(n\) times. It will store the \(n\) replicates in a list, \(n\) components long. Each of the \(n\) components will be the output of DemoInfo, which is itself a list.
Casey’s note: the lapply() here has a function built on-the-fly. What is actually going into the output? Throw a return() in there just to make it absolutely clear…
n <- 5
out <- lapply(1:n, function(i) {
stageR <- stagedat[sample(1:nL, nL, replace = TRUE), ]
fruitR <- fruitdat[sample(1:nF, nF, replace = TRUE), ]
seedR <- as.data.frame(seeddat[sample(1:nS, nS, replace = TRUE), ])
matR <- ProjMat(stagedat = stageR, fruitdat = fruitR, seeddat = seedR)
DemoInfo(matR)
})
This code above uses sample to draw row numbers at random and with replacement to create random draws of data (stageR, fruitR, and seedR). We then use ProjMat to generate the projection matrix with the random data, and use DemoInfo to perform all the eigenanalysis and demographic calculations.
Let’s look at a small subset of this output, just the five \(\lambda\) generated from five different bootstrapped data sets. The object out is a list, so using sapply on it will do the same thing to each component of the list. In this case, that something is to merely extract the bootstrapped \(\lambda\).
sapply(out, function(x) x$lambda)
## [1] 1.106350 1.158114 1.121853 1.103342 1.128336
# [1] 1.084 1.137 1.134 1.126 1.158
We see that we have five different estimates of \(\lambda\), each the dominant eigenvalue of a projection matrix calculated from bootstrapped data. We now have all the functions we need to analyze these demographic data. I have put all these together in a function called DemoBoot, whose arguments (inputs) are the raw data, and \(n\), the number of bootstrapped samples.
args(DemoBoot)
## function (stagedat = NULL, fruitdat = NULL, seeddat = NULL, n = 1)
## NULL
Now we are armed with everything we need, including estimates and a means to evaluate uncertainty, and we can move on to the ecology. We first interpret point estimates of demographic information, including \(\lambda\) and elasticities. Then we ask whether \(\lambda\) differs significantly from 1.0 using our bootstrapped confidence interval.
First, point estimates based on original data.
estims <- DemoInfo(ProjMat(stagedat, fruitdat, seeddat))
estims$lambda
## [1] 1.133944
# [1] 1.134
Our estimate of \(\lambda\) is greater than one, so the population seems to be growing. Which transitions seem to be the important ones?
round(estims$Elasticities, 4)
## [,1] [,2] [,3] [,4] [,5]
## [1,] 0.0017 0.0000 0.0000 0.0009 0.0693
## [2,] 0.0702 0.1196 0.0030 0.0005 0.0000
## [3,] 0.0000 0.0738 0.0470 0.0049 0.0000
## [4,] 0.0000 0.0000 0.0712 0.1234 0.0145
## [5,] 0.0000 0.0000 0.0044 0.0793 0.3162
It appears that the most important transition is persistence in the largest adult stage (\(a_{55}\) = 0.3). Specifically, proportional changes to the persistence in this stage, neither regressing nor dying, are predicted to have the largest positive effect on the \(\lambda\) of this population.
We stated above that the population appears to be growing. However, this was based on a sample of the population, and not the entire population. One way to make inferences about the population is to ask whether the confidence interval for \(\lambda\) lies above 1.0. Let’s use DemoBoot to bootstrap our confidence interval for \(\lambda\). First, we’ll run the bootstrap, and plot the \(\lambda\)’s.
system.time(out.boot <- DemoBoot(stagedat, fruitdat, seeddat,
n = 1000))
## user system elapsed
## 3.211 0.088 3.334
# user system elapsed
# 12.539 0.022 12.561
lambdas <- sapply(out.boot, function(out) out$lambda)
hist(lambdas, prob = T)
lines(density(lambdas))
From this it seems clear that the population is probably growing (\(\lambda\) > 1.0), because the lower limit of the histogram is relatively large (Fig. 2.6). We need to get a real confidence interval, however. Here we decide on a conventional α and then calculate quantiles, which will provide the median (the 50th percentile), and the lower and upper limits to the 95% confidence interval.
alpha <- 0.05
quantile(lambdas, c(alpha/2, 0.5, 1 - alpha/2))
## 2.5% 50% 97.5%
## 1.062207 1.128944 1.194630
# 2.5% 50% 97.5%
# 1.062 1.129 1.193
From this we see that the 95% confidence interval (i.e. the 0.025 and 0.975 quantiles) does not include 1.0. Therefore, we conclude that under the conditions experienced by this population in this year, this Chamaedorea population, from which we drew a sample, could be characterized as having a long-term asymptotic growth rate, \(\lambda\), that is greater than 1.0, and therefore would be likely to increase in abundance, if the environment remains the same.
Bootstrapping as we have done above, known variously as the basic or percentile bootstrap, is not a cure-all, and it can give inappropriate estimation and inference under some circumstances. A number of refinements have been proposed that make bootstrapping a more precise and accurate procedure. The problems are worst when the data are such that the bootstrap replicates are highly skewed, so that the mean and median are quite different. When the data are relatively symmetric, as ours are (Fig. 2.6), the inference is relatively reliable. Often skewness will cause the mean of the bootstrap samples to differ from our observed estimate, and we refer to this as bias. We should adjust the bootstrapped samples for this bias [140]. Here we calculate the bias.
bias <- mean(lambdas) - estims$lambda
bias
## [1] -0.005381649
# [1] -0.004208
We find that the bias is very small; this gives us confidence that our confidence intervals are pretty good. Nonetheless, we can be thorough and correct our samples for this bias. We subtract the bias from the bootstrapped \(\lambda\) to get our confidence interval.
quantile(lambdas - bias, c(alpha/2, 0.5, 1 - alpha/2))
## 2.5% 50% 97.5%
## 1.067589 1.134325 1.200011
# 2.5% 50% 97.5%
# 1.067 1.133 1.197
These bias-corrected quantiles also indicate that this population in this year can be characterized by a \(\lambda\) > 1.
If we want to infer something about the future success of this population, we need to make additional assumptions. First, we must assume that our sample was representative of the population; we have every reason to expect it is. Second, we need to assume that this year was representative of other years. In particular, we need to assume that the weather, the harvest intensity, and the browsing intensity are all representative. Clearly, it would be nice to repeat this for other years, and to try to get other sources of information regarding these factors.
Goldenseal (Hydrastis canadensis) is a wild plant with medicinal properties that is widely harvested in eastern North America. Its rhizome (the thick underground stem) is dug up, and so harvesting can and frequently does have serious negative impacts on populations. A particular population of goldenseal is tracked over several years and investigators find, tag, and monitor several sizes of individuals. After several years of surveys, they identify six relevant stages: dormant seed, seedling, small 1-leaved plant, medium 1-leaved plant, large 1-leaved plant, fertile plant (flowering, with 2 leaves). They determine that the population projection matrix is:
\[A = \begin{pmatrix} 0 & 0 & 0 & 0 & 0 & 1.642\\ 0.098 & 0 & 0 & 0 & 0 & 0.437 \\ 0 & 0.342 & 0.591 & 0.050 & 0.095 & 0\\ 0 & 0.026 & 0.295 & 0.774 & 0.177 & 0.194\\ 0 & 0 & 0 & 0.145 & 0.596 & 0.362\\ 0 & 0 & 0 & 0.016 & 0.277 & 0.489 \end{pmatrix}\]
library(diagram)
proj_mat <- c( 0, 0, 0, 0, 0, 1.642,
0.098, 0, 0, 0, 0, 0.437,
0, 0.342, 0.591, 0.050, 0.095, 0,
0, 0.026, 0.295, 0.774, 0.177, 0.194,
0, 0, 0, 0.145, 0.596, 0.362,
0, 0, 0, 0.016, 0.277, 0.489) %>%
matrix(nrow = 6, byrow = TRUE)
stage_names <- c("seed","seedling","S 1-leaf","M 1-leaf","L 1-leaf","fertile plant")
proj_df <- as.data.frame(proj_mat) %>% setNames(stage_names) %>% `rownames<-` (stage_names) ### is there a cleaner way to set all row names in dplyr pipe?
curves <- matrix(nrow = ncol(proj_df), ncol = ncol(proj_df), .7)
curves[2, 1] <- curves[3, 2] <- curves[4, 3] <- curves[5, 4] <- curves[6, 5] <- 0 ### setting all arrows from one stage to subsequent stage to be straight, while remainder will be curved
plotmat(proj_df, pos = 6, curve = curves,
box.size = 0.05, shadow.size = 0, box.cex = .5, cex.txt = .5, my = -0.13,
self.cex = .5, self.shiftx = 0, self.shifty = -.07 ### can't get arrows to reappear on 'self' loops once I've changed the position...
)
### regurgitating chapter code
N0 <- matrix(c(0, 10, 10, 10, 10, 10))
years <- 10
N.projections <- matrix(0, nrow = nrow(proj_mat), ncol = years + 1)
N.projections[, 1] <- N0
for (i in 1:years) {
N.projections[, i + 1] <- proj_mat %*% N.projections[, i]
}
matplot(0:years, t(N.projections), type = "l", lty = 1:6,
col = 1, ylab = "Stage Abundance", xlab = "Year")
### using adaptation of Vincent's/Casey's functions:
project_pop <- function(proj_mat, N0, years){
### set up
n_stages <- dim(proj_mat)[1]
pop <- N0
results_mat <- matrix(nrow = years, ncol = n_stages)
### loop
for (yr in 1:years){
pred_pop <- proj_mat %*% pop ### results will be long (single column)
results_mat[yr, ] <- t(pred_pop) ### transpose results from single column to single row, results_mat stores each year as a row
pop <- pred_pop ### update current population
}
### return results as matrix
results_mat <- rbind(t(N0), results_mat) ### add year zero, transposed to add as single row
return(results_mat)
}
pop_proj_df <- project_pop(proj_mat = proj_mat, N0 = c(0, 10, 10, 10, 10, 10), years = 10) %>%
as.data.frame() %>%
setNames(stage_names) %>%
mutate(year = 0:(n() - 1)) %>%
gather(stage, pop, -year) %>% ### column names become 'stage' variable, 'year' column is retained
mutate(stage = fct_inorder(stage)) ### clean way to keep stage names in order for graph - thanks Casey!
ggplot(pop_proj_df, aes(x = year, y = pop, color = as.factor(stage))) +
geom_line() +
labs(x = "Year", y = "Stage abundance", color="Stage") +
theme_classic()
eigs_right <- eigen(proj_mat) ### determine eigenvalues and vectors for projection matrix
dom.pos <- which.max(Mod(eigs_right[["values"]])) ### Mod() needed: which.max() cannot compare complex values directly
w <- Re(eigs_right[["vectors"]][, dom.pos]) ### determine corresponding eigenvector (w)
ssd <- w/sum(w) ### divide each vector value by total so you end up with proportional values
round(ssd, 3)
## [1] 0.168 0.061 0.124 0.343 0.196 0.107
### lambda is determined by dominant eigenvalue of the projection matrix
lambda <- Re(eigs_right[["values"]][dom.pos])
lambda
## [1] 1.047113
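As a cross-check (a sketch, not part of the original analysis), projecting the goldenseal matrix for many years and taking the one-step growth ratio should recover the same \(\lambda\):

```r
### Long-run growth ratio should match the dominant eigenvalue
proj_mat <- matrix(c(0, 0, 0, 0, 0, 1.642,
                     0.098, 0, 0, 0, 0, 0.437,
                     0, 0.342, 0.591, 0.050, 0.095, 0,
                     0, 0.026, 0.295, 0.774, 0.177, 0.194,
                     0, 0, 0, 0.145, 0.596, 0.362,
                     0, 0, 0, 0.016, 0.277, 0.489), nrow = 6, byrow = TRUE)
N <- c(0, 10, 10, 10, 10, 10)
for (i in 1:100) N <- proj_mat %*% N
sum(proj_mat %*% N) / sum(N) # converges to lambda, ~1.047
```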
From book: “Sensitivity and elasticity combine these to tell us the relative importance of each transition in determining \(\lambda\)” - Sensitivities: changes in \(\lambda\) given changes in each element of the transition matrix - Elasticities: sensitivities weighted by transition probabilities
eigs_left <- eigen(t(proj_mat))
eigs_left
## eigen() decomposition
## $values
## [1] 1.04711296+0.0000000i 0.61574297+0.1387211i 0.61574297-0.1387211i
## [4] 0.11181446+0.0827344i 0.11181446-0.0827344i -0.05222783+0.0000000i
##
## $vectors
## [,1] [,2] [,3]
## [1,] 0.008923221+0i -0.04464978+0.01928285i -0.04464978-0.01928285i
## [2,] 0.095343064+0i -0.30783393+0.05795318i -0.30783393-0.05795318i
## [3,] 0.261211445+0i -0.57407602+0.00000000i -0.57407602+0.00000000i
## [4,] 0.403870936+0i -0.04815033-0.26995415i -0.04815033+0.26995415i
## [5,] 0.600994083+0i 0.49089119+0.20082956i 0.49089119-0.20082956i
## [6,] 0.631104616+0i 0.16206562+0.43264923i 0.16206562-0.43264923i
## [,4] [,5] [,6]
## [1,] -0.06431903+0.05611312i -0.06431903-0.05611312i 0.24522791+0i
## [2,] -0.12075799+0.00972308i -0.12075799-0.00972308i -0.13069103+0i
## [3,] -0.04842071-0.02852412i -0.04842071+0.02852412i 0.02392394+0i
## [4,] 0.08665230+0.03275352i 0.08665230-0.03275352i -0.05216455+0i
## [5,] -0.49114543-0.09030051i -0.49114543+0.09030051i 0.38585562+0i
## [6,] 0.84671113+0.00000000i 0.84671113+0.00000000i -0.87784115+0i
v <- Re(eigs_left$vectors[, which.max(Re(eigs_left$values))])
w <- Re(eigs_right[["vectors"]][, dom.pos])
sens_mat <- v %*% t(w)/as.numeric(v %*% w)
sens_mat ### will include sensitivities to some transitions that do not exist
## [,1] [,2] [,3] [,4] [,5]
## [1,] 0.004125553 0.001484083 0.003037773 0.008411794 0.004814963
## [2,] 0.044080817 0.015857171 0.032458076 0.089878560 0.051447039
## [3,] 0.120768237 0.043443900 0.088925408 0.246240339 0.140949480
## [4,] 0.186725283 0.067170597 0.137491632 0.380723426 0.217928423
## [5,] 0.277863001 0.099955524 0.204599167 0.566548632 0.324295910
## [6,] 0.291784275 0.104963418 0.214849834 0.594933406 0.340543529
## [,6]
## [1,] 0.002630889
## [2,] 0.028110594
## [3,] 0.077014608
## [4,] 0.119075801
## [5,] 0.177194854
## [6,] 0.186072532
elas_mat <- proj_mat/lambda * sens_mat
elas_mat ### elasticities include only real transitions
## [,1] [,2] [,3] [,4] [,5] [,6]
## [1,] 0.000000000 0.000000000 0.00000000 0.000000000 0.00000000 0.004125553
## [2,] 0.004125553 0.000000000 0.00000000 0.000000000 0.00000000 0.011731618
## [3,] 0.000000000 0.014189313 0.05019030 0.011758060 0.01278773 0.000000000
## [4,] 0.000000000 0.001667858 0.03873511 0.281421339 0.03683779 0.022061331
## [5,] 0.000000000 0.000000000 0.00000000 0.078453380 0.18458406 0.061258469
## [6,] 0.000000000 0.000000000 0.00000000 0.009090647 0.09008632 0.086895561
Highest elasticity values are in [4,4] and [5,5]: surviving as a medium or large 1-leaved individual. Maybe harvest as small 1-leaved individuals or as flowering adults…?
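One more check on the elasticity matrix above (a sketch, not from the chapter): elasticities of a projection matrix must sum to 1, because \(\sum_{ij} a_{ij} v_i w_j = \lambda \, \mathbf v \cdot \mathbf w\).

```r
### Elasticities sum to 1 (self-contained check on the goldenseal matrix)
proj_mat <- matrix(c(0, 0, 0, 0, 0, 1.642,
                     0.098, 0, 0, 0, 0, 0.437,
                     0, 0.342, 0.591, 0.050, 0.095, 0,
                     0, 0.026, 0.295, 0.774, 0.177, 0.194,
                     0, 0, 0, 0.145, 0.596, 0.362,
                     0, 0, 0, 0.016, 0.277, 0.489), nrow = 6, byrow = TRUE)
eig_r <- eigen(proj_mat)
eig_l <- eigen(t(proj_mat))
dp <- which.max(Mod(eig_r$values))
lam <- Re(eig_r$values[dp])
w <- Re(eig_r$vectors[, dp])
v <- Re(eig_l$vectors[, which.max(Mod(eig_l$values))])
elas <- (proj_mat / lam) * (v %*% t(w) / as.numeric(v %*% w))
sum(elas) # mathematically exactly 1
```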
### set up the initial vectors and matrices to feed into the function
n_0 <- c(0, 10, 10, 10, 10, 10)
p_mat <- c( 0, 0, 0, 0, 0, 1.642,
0.098, 0, 0, 0, 0, 0.437,
0, 0.342, 0.591, 0.050, 0.095, 0,
0, 0.026, 0.295, 0.774, 0.177, 0.194,
0, 0, 0, 0.145, 0.596, 0.362,
0, 0, 0, 0.016, 0.277, 0.489) %>%
matrix(nrow = 6, byrow = TRUE)
stages <- c('seed', 'seedling', 'sm 1 leaf',
'med 1 leaf', 'lg 1 leaf', 'flower')
source('plot_projmat.R')
plot_projmat(p_mat, stages = stages)
### set up a function (based on Vincent's!):
proj_pop <- function(p_mat, n_0, yrs) {
### error checking
if(dim(p_mat)[1] != dim(p_mat)[2]) {
stop('Projection matrix should be square.')
}
if(length(n_0) != dim(p_mat)[1]) {
stop('Pop vector should be same dimension as matrix')
}
## Loop to project population
### initialize population as matrix
n_stages <- dim(p_mat)[1]
pop <- n_0
results_mat <- matrix(nrow = yrs, ncol = n_stages)
for (yr in 1:yrs){
### multiply projection matrix by current pop:
pred_pop <- p_mat %*% pop
### put results into results matrix (transpose the pop vector):
results_mat[yr, ] <- t(pred_pop)
### update population:
pop <- pred_pop
}
results_mat <- rbind(n_0, results_mat) ### add year zero
return(results_mat)
}
proj_pop2 <- function(p_mat, n_0, yrs) {
### for this one, try raising p_mat to the power of yrs
### error checking
if(dim(p_mat)[1] != dim(p_mat)[2]) {
stop('Projection matrix should be square.')
}
if(length(n_0) != dim(p_mat)[1]) {
stop('Pop vector should be same dimension as matrix')
}
## Loop to project population
### initialize population as matrix
n_stages <- dim(p_mat)[1]
pop <- n_0
results_list <- lapply(0:yrs, FUN = function(x) {
pred_pop <- expm::`%^%`(p_mat , x) %*% pop
### NOTE: expm::`%^%`(p_mat , x) is the same as (p_mat %^% x) but
### allows me to include the package namespace for reference
}) %>%
setNames(0:yrs)
results_mat <- bind_rows(results_list) %>%
t()
return(results_mat)
}
# pop_projection <- proj_pop(p_mat, n_0, yrs = 10)
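As a quick sanity check on the matrix-power idea behind `proj_pop2`, here is a minimal sketch (with a made-up 2×2 matrix) showing that stepping forward one year at a time gives the same answer as applying the cubed matrix once:

```r
### Toy check (matrix values made up for illustration): projecting three
### years one step at a time should equal applying the cubed matrix once.
A  <- matrix(c(0.0, 2.0,
               0.3, 0.5), nrow = 2, byrow = TRUE)
n0 <- c(10, 10)
### three years, one multiplication per year:
n3_loop <- A %*% (A %*% (A %*% n0))
### three years in one shot (expm's %^% does the same for larger powers):
n3_pow  <- (A %*% A %*% A) %*% n0
all.equal(n3_loop, n3_pow) ### TRUE
```

This is why the `lapply` version works: year \(t\) depends on year \(t-1\) only through repeated multiplication by \(\mathbf A\), which collapses to \(\mathbf A^t \mathbf n_0\).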
pop_projection <- proj_pop2(p_mat, n_0, yrs = 20)
pop_proj_df <- pop_projection %>%
as.data.frame() %>%
setNames(stages) %>%
mutate(year = 0:(n() - 1)) %>%
gather(class, pop, -year) %>%
mutate(class = fct_inorder(class))
ggplot(pop_proj_df %>% filter(year <= 10),
aes(x = year, y = pop, color = class)) +
geom_line(aes(group = class)) +
theme_classic() +
  labs(x = 'year', y = 'abundance')

ggplot(pop_proj_df, aes(x = year, y = pop, color = class)) +
geom_line(aes(group = class)) +
theme_classic() +
scale_y_log10() +
  labs(x = 'year', y = 'log(abundance)')

The straight, parallel lines on the log plot indicate that all stages are increasing at the same rate as the overall population (the asymptotic growth rate). Once the trajectories reach those straight lines, the population has arrived at its stable stage distribution (which we can then calculate from the ratios of the stage abundances at any point).
The first, or dominant, eigenvalue is the long term asymptotic finite rate of increase λ. Its corresponding eigenvector provides the stable stage distribution.
First, we find it from the eigenvector corresponding to the dominant eigenvalue. Then, we check it against the long-term projections by computing the proportion of the population in each class.
eigs_pmat <- eigen(p_mat)
### find the eigenvector that corresponds to the dominant eigenvalue
dom_pos <- which.max(Re(eigs_pmat$values)) ### typically the first value; Re() guards against complex eigenvalue pairs
w <- Re(eigs_pmat$vectors[ , dom_pos])
ssd <- w/sum(w)
round(ssd, 3)
## [1] 0.168 0.061 0.124 0.343 0.196 0.107
### could also look at long-term ratios
stable_state <- pop_proj_df %>%
filter(year == max(year)) %>%
mutate(pct_of_pop = round(pop / sum(pop), 3))
stable_state$pct_of_pop
## [1] 0.168 0.061 0.124 0.343 0.196 0.107
### find the dominant eigenvalue
lambda <- Re(eigs_pmat$values[dom_pos]) ### extract just the real component
lambda
## [1] 1.047113
This tells us the overall asymptotic growth rate of the population. The stable stage distribution is stable in the sense that the stage proportions no longer change, so each stage also grows at the same rate \(\lambda\).
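We can verify that claim on a small made-up matrix: once the population vector is proportional to the dominant eigenvector, one projection step just multiplies every stage by \(\lambda\).

```r
### Toy 2x2 projection matrix (values made up for illustration)
A <- matrix(c(0.0, 2.0,
              0.3, 0.5), nrow = 2, byrow = TRUE)
eigs <- eigen(A)               ### R sorts eigenvalues by decreasing modulus,
lambda <- Re(eigs$values[1])   ### so the dominant one comes first
ssd <- Re(eigs$vectors[, 1])
ssd <- ssd / sum(ssd)          ### normalize to proportions
### projecting the stable stage distribution scales every stage by lambda:
all.equal(as.numeric(A %*% ssd), lambda * ssd) ### TRUE
```

This is just the eigenvector equation \(\mathbf{Aw} = \lambda \mathbf{w}\) restated: the stable stage distribution is a rescaled \(\mathbf w\), so projection cannot change its shape, only its scale.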
Let’s also look at the reproductive value of each stage, since it’s related. This is calculated as the dominant “left” eigenvector, i.e. \(\mathbf{vA} = \lambda \mathbf{v}\) (as opposed to the “right” eigenvector that is related to the asymptotic rate of pop increase, i.e. the \(\mathbf w\) in \(\mathbf{Aw} = \lambda \mathbf{w}\)). We can find the left eigenvector by finding the dominant eigenvector in the transpose of the projection matrix, i.e. t(p_mat).
eigs_t_pmat <- eigen(t(p_mat))
### find the eigenvector that corresponds to the dominant eigenvalue
dom_pos <- which.max(Re(eigs_t_pmat$values)) ### typically the first value; Re() guards against complex eigenvalue pairs
v <- Re(eigs_t_pmat$vectors[ , dom_pos])
rv <- v/v[1] ### normalize so the first stage = 1 (new name, so we don't clobber ssd)
round(rv, 3)
## [1] 1.000 10.685 29.273 45.261 67.352 70.726
So as expected, the older an individual is, the greater its reproductive value - though the second-to-last and last stages are pretty close!
The stable stage distribution provides the relative abundance of individuals in each stage. Reproductive value provides the contribution to future population growth of individuals in each stage. Sensitivity and elasticity combine these to tell us the relative importance of each transition in determining λ.
First determine the sensitivity:
\[\frac{\partial \lambda}{\partial a_{ij}} = \frac{v_{i} w_{j}}{\mathbf{v} \cdot \mathbf{w}}\]
### v and w vectors are calculated above...
vw_mat <- v %*% t(w)
v_dot_w <- as.numeric(v %*% w)
sens_mat <- vw_mat / v_dot_w

Then each sensitivity is weighted by the corresponding transition rate relative to \(\lambda\), i.e. \(\mathbf A / \lambda\): \[e_{ij} = \frac{a_{ij}}{\lambda} \frac{\partial \lambda}{\partial a_{ij}}\]
elas_mat <- p_mat / lambda * sens_mat ### element-wise, not %*%: each e_ij weights the sensitivity of a_ij by that single transition's rate
elas_mat %>% round(5)
## [,1] [,2] [,3] [,4] [,5] [,6]
## [1,] 0.00000 0.00000 0.00000 0.00000 0.00000 0.00413
## [2,] 0.00413 0.00000 0.00000 0.00000 0.00000 0.01173
## [3,] 0.00000 0.01419 0.05019 0.01176 0.01279 0.00000
## [4,] 0.00000 0.00167 0.03874 0.28142 0.03684 0.02206
## [5,] 0.00000 0.00000 0.00000 0.07845 0.18458 0.06126
## [6,] 0.00000 0.00000 0.00000 0.00909 0.09009 0.08690
From the book:
Although these values do not tell us which stages and transitions will, in reality, be influenced by natural phenomena or management practices, they provide us with the predicted effects on λ of a proportional change in a demographic rate, P or F. This is particularly important in the management of invasive (or endangered) species where we seek to have the maximum impact for the minimum amount of effort and resources [23,48].
| stage | pct of pop | total elasticity |
|---|---|---|
| seeds | 17% | 0.004 |
| seedlings | 6% | 0.016 |
| small one-leaved plants | 12% | 0.09 |
| med one-leaved plants | 34% | 0.38 |
| lg one-leaved plants | 20% | 0.32 |
| flowers | 11% | 0.19 |
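The "total elasticity" column above is just the column sums of the elasticity matrix. As a check, this chunk rebuilds `p_mat` from the values given earlier so it stands alone (taking the dominant eigenvalue as the first, since R sorts eigenvalues by decreasing modulus):

```r
### rebuild the plant projection matrix from the values above
p_mat <- matrix(c(0,     0,     0,     0,     0,     1.642,
                  0.098, 0,     0,     0,     0,     0.437,
                  0,     0.342, 0.591, 0.050, 0.095, 0,
                  0,     0.026, 0.295, 0.774, 0.177, 0.194,
                  0,     0,     0,     0.145, 0.596, 0.362,
                  0,     0,     0,     0.016, 0.277, 0.489),
                nrow = 6, byrow = TRUE)
lambda <- Re(eigen(p_mat)$values[1])
w <- Re(eigen(p_mat)$vectors[, 1])    ### right (stable stage) eigenvector
v <- Re(eigen(t(p_mat))$vectors[, 1]) ### left (reproductive value) eigenvector
sens_mat <- (v %*% t(w)) / as.numeric(v %*% w)
elas_mat <- p_mat / lambda * sens_mat
### per-stage total elasticity = column sums:
round(colSums(elas_mat), 3) ### 0.004 0.016 0.089 0.381 0.324 0.186
```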
Small one-leaved plants and flowering plants would be better to harvest than medium and large one-leaved plants. But seeds might be the best bet if they have the desired properties - very low elasticity and a sixth of total population at stable state.
Crouse et al. performed a demographic analysis of an endangered sea turtle species, the loggerhead (Caretta caretta). Management of loggerhead populations seemed essential for their long-term survival, and a popular management strategy had been, and still is, to protect nesting females, eggs, and hatchlings. The groundbreaking work by Crouse and her colleagues compiled data to create a stage-based projection matrix to analyze quantitatively which stages are most and least important in influencing long-term growth rate. This work led to US Federal laws requiring that US shrimp fishermen use nets that include Turtle Excluder Devices (TEDs, http://www.nmfs.noaa.gov/pr/species/turtles/teds.htm). Crouse et al. determined the transition matrix for their loggerhead populations:
\[A = \begin{pmatrix} 0 & 0 & 0 & 0 & 127 & 4 & 80\\ 0.6747 & 0.7370 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0.0486 & 0.6610 & 0 & 0 & 0 & 0\\ 0 & 0 & 0.0147 & 0.6907 & 0 & 0 & 0\\ 0 & 0 & 0 & 0.0518 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 & 0.8091 & 0 & 0\\ 0 & 0 & 0 & 0 & 0 & 0.8091 & 0.8089 \end{pmatrix}\]
### set up the initial vectors and matrices
n_0 <- c(0, 10, 10, 10, 10, 10, 10)
p_mat <- c( 0 , 0 , 0 , 0 , 127 , 4 , 80 ,
0.6747 , 0.7370 , 0 , 0 , 0 , 0 , 0 ,
0 , 0.0486 , 0.6610 , 0 , 0 , 0 , 0 ,
0 , 0 , 0.0147 , 0.6907 , 0 , 0 , 0 ,
0 , 0 , 0 , 0.0518 , 0 , 0 , 0 ,
0 , 0 , 0 , 0 , 0.8091 , 0 , 0 ,
0 , 0 , 0 , 0 , 0 , 0.8091 , 0.8089) %>%
matrix(nrow = 7, byrow = TRUE)
stages <- c('egg', 'hatchling', 'baby', 'juvenile',
            'young adult', 'adult', 'mature adult')
# source('plot_projmat.R')
plot_projmat(p_mat, stages = stages)
eigs_pmat <- eigen(p_mat)
### find the eigenvector that corresponds to the dominant eigenvalue
dom_pos <- which.max(Re(eigs_pmat$values)) ### typically the first value; Re() guards against complex eigenvalue pairs
w <- Re(eigs_pmat$vectors[ , dom_pos])
ssd <- w/sum(w)
round(ssd, 4)
## [1] 0.2065 0.6698 0.1146 0.0066 0.0004 0.0003 0.0018
lambda <- Re(eigs_pmat$values[dom_pos])
lambda
## [1] 0.945031
Because \(\lambda\) is less than 1, this tells us the population is declining.
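A quick way to put \(\lambda < 1\) in context: asymptotically the population shrinks by a factor of \(\lambda\) each year, so the time for it to halve is \(\log(0.5)/\log(\lambda)\). A sketch using the value computed above:

```r
### asymptotic half-life of a declining population
lambda <- 0.945031 ### dominant eigenvalue from above
t_half <- log(0.5) / log(lambda)
round(t_half, 1) ### ~12.3 years for the population to halve
```

(This is only the asymptotic rate; transient dynamics before the stable stage distribution is reached can be faster or slower.)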
eigs_t_pmat <- eigen(t(p_mat))
### find the eigenvector that corresponds to the dominant eigenvalue
dom_pos <- which.max(Re(eigs_t_pmat$values)) ### typically the first value; Re() guards against complex eigenvalue pairs
v <- Re(eigs_t_pmat$vectors[ , dom_pos])
rv <- v/v[1] ### normalize so the first stage = 1
round(rv, 2)
## [1] 1.00 1.40 6.00 115.84 568.78 507.37 587.67
Then we calculate the sensitivity matrix and multiply it element-wise by the projection matrix divided by \(\lambda\) to get the elasticities.
vw_mat <- v %*% t(w)
v_dot_w <- v %*% w %>% as.numeric()
sens_mat <- vw_mat / v_dot_w
elas_mat <- p_mat / lambda * sens_mat ### element-wise again: each sensitivity is weighted by its own transition rate
elas_mat %>% round(5)
## [,1] [,2] [,3] [,4] [,5] [,6] [,7]
## [1,] 0.000 0.00000 0.00000 0.00000 0.01205 0.00032 0.03863
## [2,] 0.051 0.18069 0.00000 0.00000 0.00000 0.00000 0.00000
## [3,] 0.000 0.05100 0.11869 0.00000 0.00000 0.00000 0.00000
## [4,] 0.000 0.00000 0.05100 0.13851 0.00000 0.00000 0.00000
## [5,] 0.000 0.00000 0.00000 0.05100 0.00000 0.00000 0.00000
## [6,] 0.000 0.00000 0.00000 0.00000 0.03895 0.00000 0.00000
## [7,] 0.000 0.00000 0.00000 0.00000 0.00000 0.00000 0.03863 0.22952
The largest elasticities are for surviving within the mature adult ([7,7]) and hatchling ([2,2]) stages, not the egg stage - consistent with Crouse et al.'s argument that protecting older turtles (e.g., with TEDs) does more for λ than protecting eggs on beaches alone.
# pop_projection <- proj_pop(p_mat, n_0, yrs = 10)
pop_projection <- proj_pop2(p_mat, n_0, yrs = 50)
pop_proj_df <- pop_projection %>%
as.data.frame() %>%
setNames(stages) %>%
mutate(year = 0:(n() - 1)) %>%
gather(class, pop, -year) %>%
mutate(class = fct_inorder(class))
ggplot(pop_proj_df %>% filter(year <= 10),
aes(x = year, y = pop, color = class)) +
geom_line(aes(group = class)) +
theme_classic() +
  labs(x = 'year', y = 'abundance')

ggplot(pop_proj_df, aes(x = year, y = pop, color = class)) +
geom_line(aes(group = class)) +
theme_classic() +
scale_y_log10() +
  labs(x = 'year', y = 'log(abundance)')

As before, the straight, parallel lines on the log plot show every stage changing at the same rate as the whole population - here declining, since \(\lambda < 1\). Once the trajectories are straight, the population has reached its stable stage distribution.